feat(hesai): add CUDA-accelerated point cloud decoder #421
Open
k1832 wants to merge 5 commits into tier4:main
Force-pushed from 580316f to cd2b0e8.
Add a GPU decode path for Hesai LiDAR sensors, gated behind compile-time BUILD_CUDA=ON and the runtime NEBULA_USE_CUDA=1 environment variable. The implementation includes:
- CUDA kernel for batched point cloud decoding (hesai_cuda_kernels.cu)
- Angle LUT upload and GPU scan buffer management in hesai_decoder.hpp
- GPU-vs-CPU equivalence tests for the OT128 (Pandar128E4X) sensor

The GPU path processes an entire scan in a single kernel launch, using pre-computed angle lookup tables and a sparse output buffer. When CUDA is not available or NEBULA_USE_CUDA is unset, the existing CPU path is used with zero overhead.

Signed-off-by: Keita Morisaki <kmta1236@gmail.com>
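The lookup-table and sparse-buffer idea in the commit message above can be illustrated on the CPU. The following is a minimal sketch under stated assumptions, not the PR's kernel: `SketchPoint`, `AngleLut`, `decode_sparse`, and `compact` are hypothetical names, elevation is omitted, and "sparse" is assumed to mean one output slot per raw return with invalid slots flagged and removed afterwards.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical point type; the PR's CudaNebulaPoint layout is not reproduced here.
struct SketchPoint {
  float x;
  float y;
  float z;
  bool valid;
};

// Precompute sin/cos for every azimuth step so the decode loop only indexes.
struct AngleLut {
  std::vector<float> sin_az;
  std::vector<float> cos_az;

  explicit AngleLut(std::size_t steps) : sin_az(steps), cos_az(steps) {
    assert(steps > 0);
    const double two_pi = 6.283185307179586;
    for (std::size_t i = 0; i < steps; ++i) {
      const double a = two_pi * static_cast<double>(i) / static_cast<double>(steps);
      sin_az[i] = static_cast<float>(std::sin(a));
      cos_az[i] = static_cast<float>(std::cos(a));
    }
  }
};

// One slot per raw return: slot i is written only by work item i, so a GPU
// version of this loop would need no atomics. Assumption: distance <= 0 means
// "no return" and leaves the slot flagged invalid.
std::vector<SketchPoint> decode_sparse(const std::vector<std::uint16_t>& azimuth_idx,
                                       const std::vector<float>& distance_m,
                                       const AngleLut& lut) {
  assert(azimuth_idx.size() == distance_m.size());
  std::vector<SketchPoint> slots(distance_m.size(), SketchPoint{0.0f, 0.0f, 0.0f, false});
  for (std::size_t i = 0; i < slots.size(); ++i) {
    if (distance_m[i] <= 0.0f) continue;
    const std::size_t k = azimuth_idx[i] % lut.sin_az.size();
    slots[i].x = distance_m[i] * lut.cos_az[k];
    slots[i].y = distance_m[i] * lut.sin_az[k];
    slots[i].z = 0.0f;  // elevation omitted in this sketch
    slots[i].valid = true;
  }
  return slots;
}

// Compact the sparse slots into a dense cloud, preserving order.
std::vector<SketchPoint> compact(const std::vector<SketchPoint>& slots) {
  std::vector<SketchPoint> dense;
  for (const auto& p : slots) {
    if (p.valid) dense.push_back(p);
  }
  return dense;
}
```

Writing to a fixed slot per return keeps every work item independent, at the cost of a compaction pass over the buffer afterwards.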
- Copyright year 2024 -> 2026 for new files
- Replace deprecated find_package(CUDA) with find_package(CUDAToolkit)
- Remove --expt-relaxed-constexpr flag (not needed)
- Remove unused per-packet kernel and launcher (dead code)
- Batch launcher returns bool; caller logs via NEBULA_LOG_STREAM
- Reorder CudaNebulaPoint fields for better memory packing
- Remove redundant is_multi_frame member; use n_frames > 1
- Make HesaiCudaDecoder destructor virtual
- Add int32_t range guarantee comment in angle corrector

Signed-off-by: Keita Morisaki <kmta1236@gmail.com>
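To illustrate why field order matters for memory packing, here is a small sketch with two hypothetical layouts. These structs are illustrative only and are not the actual CudaNebulaPoint definition; the size comparison holds on ABIs where `float` has an alignment greater than 1, which is the common case.

```cpp
#include <cstdint>

// Hypothetical layout with 1-byte fields interleaved between floats:
// each small field is typically followed by padding up to the next float.
struct PointInterleaved {
  std::uint8_t return_type;
  float x;
  std::uint8_t channel;
  float y;
  std::uint8_t intensity;
  float z;
};

// Same fields, largest alignment first: the three 1-byte fields share
// a single trailing word instead of each forcing its own padding.
struct PointGrouped {
  float x;
  float y;
  float z;
  std::uint8_t return_type;
  std::uint8_t channel;
  std::uint8_t intensity;
};

// Checked at compile time; on typical 64-bit ABIs this is 16 vs 24 bytes.
static_assert(sizeof(PointGrouped) < sizeof(PointInterleaved),
              "grouping small fields should shrink the struct");
```

A smaller point struct reduces the size of the scan buffer and of each device-to-host copy.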
Force-pushed from cd2b0e8 to 508175b.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #421 +/- ##
==========================================
+ Coverage 48.34% 48.36% +0.02%
==========================================
Files 156 157 +1
Lines 12996 13004 +8
Branches 6900 6903 +3
==========================================
+ Hits 6283 6290 +7
- Misses 5326 5327 +1
Partials 1387 1387
Replace .points access with direct iteration over PointCloud<T> (which now extends std::vector<T> instead of pcl::PointCloud).

Signed-off-by: Keita Morisaki <kmta1236@gmail.com>
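The change from `.points` access to direct iteration can be sketched as follows. `SketchPoint`, this `PointCloud` template, and `max_x` are hypothetical stand-ins written for illustration; the real nebula types are not reproduced here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical point type for illustration.
struct SketchPoint {
  float x;
  float y;
  float z;
};

// Assumed shape of the new container: the cloud *is* a std::vector<T>,
// so there is no separate .points member to go through.
template <typename T>
struct PointCloud : public std::vector<T> {
  using std::vector<T>::vector;  // inherit std::vector constructors
};

// Before (pcl-style):  for (const auto & p : cloud.points) { ... }
// After:               for (const auto & p : cloud) { ... }
template <typename T>
float max_x(const PointCloud<T>& cloud) {
  assert(!cloud.empty());
  float best = cloud.front().x;
  for (const auto& p : cloud) {
    if (p.x > best) best = p.x;
  }
  return best;
}
```

Because the cloud inherits from `std::vector<T>`, range-based loops, `size()`, and `push_back()` all apply to it directly.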
Force-pushed from 62ab94c to 09658fe.
- Add missing #include <string> in hesai_decoder.hpp
- Add missing #include <limits> in hesai_cuda_decoder_test.cpp
- Fix readability/braces warning for ifdef-guarded else block

Signed-off-by: Keita Morisaki <kmta1236@gmail.com>
PR Type
Related Links
Description
Add a GPU-accelerated decode path for Hesai LiDAR sensors using CUDA. The feature is:
- Compile-time gated: -DBUILD_CUDA=ON. When the CUDA toolkit is not found, the build silently falls back to CPU-only.
- Runtime gated: the NEBULA_USE_CUDA=1 environment variable. When unset, the existing CPU path is used with zero overhead.

What it does
- … launch_decode_hesai_scan_batch)

Files changed
- hesai_cuda_kernels.cu
- hesai_cuda_decoder.hpp
- hesai_decoder.hpp
- hesai_sensor.hpp: max_scan_buffer_points() for GPU buffer sizing
- angle_corrector_*.hpp
- nebula_hesai_decoders/CMakeLists.txt
- nebula_hesai/CMakeLists.txt
- hesai_cuda_decoder_test.cpp

Known limitations
- … return_type field (always 0)

Review Procedure
Build (with CUDA)
Requires NVIDIA CUDA Toolkit (tested with CUDA 12.x). If the toolkit is not found, the build succeeds but CUDA support is silently disabled.
Running with CUDA enabled
The GPU decode path is gated by a runtime environment variable:
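As a minimal sketch of how such a gate could be read in C++: the helper name `env_flag_is_one` is hypothetical, and it is an assumption that the variable must equal exactly "1" to enable the GPU path.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: true only when the named environment variable is set
// and its value is exactly "1". Unset or any other value selects the CPU path.
inline bool env_flag_is_one(const char* name) {
  assert(name != nullptr);
  const char* value = std::getenv(name);
  return value != nullptr && std::strcmp(value, "1") == 0;
}
```

A caller would evaluate `env_flag_is_one("NEBULA_USE_CUDA")` once at decoder construction, so the check adds no per-packet cost.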
Test
Test results
Remarks
- Without CUDA (BUILD_CUDA=OFF), the 5 CUDA tests are compiled but skip at runtime via GTEST_SKIP(), so they do not break CPU-only CI.

Pre-Review Checklist for the PR Author
PR Author should check the checkboxes below when creating the PR.
Checklist for the PR Reviewer
Reviewers should check the checkboxes below before approval.
Post-Review Checklist for the PR Author
PR Author should check the checkboxes below before merging.
CI Checks